Skip to content

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup#7

Merged
harshitsinghbhandari merged 3 commits into
feat/lcm-sm-contractsfrom
session/aa-5
May 27, 2026
Merged

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup#7
harshitsinghbhandari merged 3 commits into
feat/lcm-sm-contractsfrom
session/aa-5

Conversation

@harshitsinghbhandari

Copy link
Copy Markdown
Collaborator

Summary

Implements ports.SessionManager (the explicit-mutation half of the LCM+SM lane) in backend/internal/session/, against fakes for the outbound ports. The SM drives the Runtime/Agent/Workspace plugins, seeds the initial lifecycle record, and routes mutation outcomes to the LCM (OnSpawnCompleted / OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read via DeriveLegacyStatus, never persisted).

Base is feat/lcm-sm-contracts (the integration branch), not main.

Pipelines

  • Spawn: Workspace.Create → Runtime.Create (env AO_SESSION_ID/AO_PROJECT_ID/AO_ISSUE_ID layered over the agent env) → Seed → LCM.OnSpawnCompleted. Eager rollback of completed steps on any failure (the record is seeded late, so an unseeded orphan would be invisible to Cleanup). If OnSpawnCompleted fails after seeding, best-effort routes the orphan to errored via OnKillRequested(KillError) since the store has no delete.
  • Kill: LCM.OnKillRequested first → Runtime.Destroy → Workspace.Destroy, honoring the worktree-remove safety — a refusal (path still registered after prune ⇒ may hold uncommitted work) surfaces as an error with WorkspaceFreed:false and is never forced.
  • List/Get: read records, attach derived status; single producer, never persisted.
  • Send: via AgentMessenger.
  • Restore: fallible I/O first (workspace restore + runtime create, never destroys the worktree on failure) → re-seed/reopen lifecycle (not_started/spawn_requested, PR cleared_on_restore) → relaunch via Agent.GetRestoreCommandOnSpawnCompleted.
  • Cleanup: reclaims terminal sessions' workspaces, skips paths holding uncommitted work → CleanupResult{Cleaned, Skipped}.

⚠️ Store-contract additions (need @tom's review)

OnSpawnCompleted requires a pre-seeded record, but LifecycleStore had no way to create one with identity, and single-id reads (Get/Kill) had no record-with-identity accessor (Load is lifecycle-only). Added two methods, discussed and approved with the lane owner but flagged here since they touch the persistence contract:

  • Seed(ctx, SessionRecord) error — explicit create-with-identity; errors if the id already exists (re-seed on Restore goes through PatchLifecycle).
  • Get(ctx, SessionID) (SessionRecord, bool, error) — single full record read (symmetric with List).

The lifecycle package's test fake was updated to satisfy the wider interface; LCM tests remain green. No other LCM / decide-core / contract changes.

Out of scope / flags

  • preflight appears in the spec but no outbound port exposes it → treated as out of scope.
  • "Agent stop" in the kill pipeline has no port method → subsumed by Runtime.Destroy (the agent runs inside the runtime).
  • Real tmux/claude-code/worktree adapters, full 3-layer prompt assembly, orchestrator-as-session, real CDC — all out of scope per the lane plan.

Test plan

Tests route through the real LCM Manager (wrapped to record call order):

  • Spawn happy path — record seeded with identity, OnSpawnCompleted flips runtime→alive, AO_* env wired, handles in metadata, status spawning
  • Spawn runtime-create failure — eager workspace rollback, no record seeded, no OnSpawnCompleted
  • Kill ordering (OnKillRequested < Runtime.Destroy < Workspace.Destroy) + terminal killed status
  • Kill worktree-remove refusal surfaced (WorkspaceFreed:false, not force-deleted)
  • List/Get derive status (table) + Get not-found
  • Send routes to messenger
  • Restore relaunch via resume command + reopened lifecycle
  • Cleanup skips uncommitted-work worktrees, ignores live sessions

Gates: gofmt -l . empty · go build ./... · go vet ./... · go test -race ./... — all green.

🤖 Generated with Claude Code

…store/cleanup)

Implements ports.SessionManager against fakes for the outbound ports. The SM is
the explicit-mutation half of the lane: it drives Runtime/Agent/Workspace, seeds
the initial lifecycle, and routes outcomes to the LCM (OnSpawnCompleted /
OnKillRequested). It never derives observed state and is the single producer of
the derived display status (attached on read, never persisted).

- Spawn: Workspace.Create -> Runtime.Create (AO_* identity env) -> Seed ->
  OnSpawnCompleted, with eager rollback of completed steps on failure.
- Kill: OnKillRequested first -> Runtime.Destroy -> Workspace.Destroy, honoring
  the worktree-remove safety (refusal surfaced, never forced).
- List/Get: derive status via DeriveLegacyStatus. Send: via AgentMessenger.
  Restore: re-seed (reopen) + relaunch via GetRestoreCommand. Cleanup: reclaim
  terminal sessions, skip worktrees holding uncommitted work.

Store-contract additions (co-owned with Tom's persistence layer, flagged for
review): LifecycleStore.Seed (explicit create-with-identity; OnSpawnCompleted
requires a seeded record) and LifecycleStore.Get (single record-with-identity
read; Load is lifecycle-only). Lifecycle test fake updated to satisfy both.

Tests route through the real LCM Manager (wrapped to record call order).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements a new ports.SessionManager in backend/internal/session/ to own explicit session mutations (spawn/kill/send/restore/cleanup) while delegating canonical lifecycle transitions to the existing LCM, and deriving the display Status only on reads.

Changes:

  • Added session.Manager implementing the SM pipelines for Spawn/Kill/List/Get/Send/Restore/Cleanup with rollback behavior on spawn failures.
  • Extended ports.LifecycleStore with Seed (create-with-identity) and Get (single record read) to support SM needs beyond lifecycle-only Load.
  • Added an in-package fake harness and tests that exercise SM behavior against the real lifecycle manager plus faked outbound ports.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
backend/internal/session/manager.go New Session Manager implementation wiring outbound ports, store seeding, LCM notifications, and read-model derivation.
backend/internal/session/manager_test.go SM tests covering spawn/kill/list/get/send/restore/cleanup flows using a harness.
backend/internal/session/fakes_test.go Fakes for store/runtime/agent/workspace/messenger + recording LCM wrapper to assert pipeline ordering.
backend/internal/ports/outbound.go Adds Seed and Get to LifecycleStore contract with explanatory docs.
backend/internal/lifecycle/fakes_test.go Updates lifecycle test fake store to satisfy the expanded LifecycleStore interface.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread backend/internal/session/manager.go
Comment thread backend/internal/session/manager.go
Comment thread backend/internal/session/manager_test.go

@harshitsinghbhandari harshitsinghbhandari left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — Session Manager (reviewed via subagent + independent verification)

Verified in a throwaway worktree: gofmt clean, build/vet pass, go test -race green; internal/session at 67.5% coverage. Spec-faithful and the invariants hold — display status is derived only on read (toSession) and never persisted, kills route through LCM.OnKillRequested, and the tests exercise the real lifecycle.Manager (not a fake LCM) for the round-trips, which is exactly right. Two should-fix items before this is clean; the rest is minor/doc-level.

1. LifecycleStore.Seed + Get — clean ✅

Seed taking a full SessionRecord and rejecting an existing id (create-only), with Restore routed through PatchLifecycle instead, cleanly separates create from reopen and avoids a double-seed footgun. Get is not redundant with Load (lifecycle-only, hot path) or List (project-scoped) — the SM needs identity by id for teardown/read-model. Both are ordinary CRUD for a real store. Minor: a one-line doc noting Load = hot-path lifecycle-only vs Get = full record would help Tom.

2. Spawn rollback + orphan handling — logic right, path untested (should-fix)

Eager rollback is correct (late seed; each step unwinds the prior). The OnSpawnCompleted-failure orphan route (seed + runtime up, then route to terminal-errored via OnKillRequested(KillError) + teardown) is the right call — but there is no test for it, and it's the most subtle path flagged. Add a test (fake LCM whose OnSpawnCompleted errors) asserting terminal-errored + OnKillRequested(KillError) fired + runtime/workspace torn down. Also untested: workspace-create-failure rollback, and Seed duplicate-id rejection.

3. Kill ordering + worktree-remove refusal — correct & tested ✅

OnKillRequested → Runtime.Destroy → Workspace.Destroy, asserted by call-order indices; refusal surfaces as an error with WorkspaceFreed=false and the path is never force-deleted. Note: the spec's explicit "Agent stop" step is folded into Runtime.Destroy (agent runs inside the runtime) — defensible, but call it out in the PR body so the Agent port's stop semantics aren't assumed elsewhere. Minor: Kill reconstructs handles from metadata; a seeded-but-never-completed session yields empty handle/path to Destroy — should be a safe no-op but isn't guarded/tested.

4. Restore — no rollback → runtime leak + stranded state (should-fix)

Restore does Workspace.Restore → Runtime.Create → PatchLifecycle(reopen) → OnSpawnCompleted, but on a failure at the reopen or OnSpawnCompleted step it returns the error without destroying the runtime it just created → orphan runtime. Worse, if OnSpawnCompleted fails after the reopen patch landed, the record is left not_started (display: spawning) with a live runtime and no recovery path. Spawn handles exactly this; Restore should mirror it (roll back the runtime on reopen/OnSpawnCompleted failure). No Restore failure path is tested. The reopen itself (PR → none/cleared_on_restore, session → not_started/spawn_requested, via PatchLifecycle not Seed) is correct and tested on the happy path.

5. Tests against the real LCM — yes ✅

recordingLCM wraps the real lifecycle.New(...) and only logs call order, so OnSpawnCompleted/OnKillRequested run the real load→decide→persist pipeline and the kill test asserts the LCM produced terminated/manually_killed. Coverage gaps are the error paths above (orphan route, all Restore-failure paths) — that's what's holding 67.5%.

Minor

  • Send has no existence/terminal guard — forwards blindly to AgentMessenger. Probably fine (messenger owns busy-detect/verify), but confirm it's intentional.
  • Cleanup drops a runtime-destroy error silently even when the workspace is later reclaimed — fine for terminal sessions, undocumented.
  • buildPrompt is a documented stub for the 3-layer assembly — fine for this split.

Recommendation: Approve-with-follow-ups. Sound design, invariants hold, real-LCM testing is exactly right. Two before-clean items: (4) give Restore the same rollback Spawn has, and (2) test the OnSpawnCompleted-failure orphan route + Restore error paths.

harshitsinghbhandari and others added 2 commits May 27, 2026 13:50
…ntime on post-create failure (PR #7 review)

- Restore now fails early with a clear error if MetaAgentSessionID is missing,
  rather than emitting an ambiguous "resume nothing" launch command (no stored
  prompt means a fresh-launch fallback isn't possible).
- On a post-runtime-create failure (reopen patch or OnSpawnCompleted), best-effort
  destroy the newly created runtime (never the workspace, which holds prior work)
  so we don't strand a live process while parking the session terminal.
- Added a test for Restore with a missing agent session id: errors early, touches
  no workspace/runtime, leaves the session terminal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e rollback (review follow-ups)

Adds an injectable OnSpawnCompleted failure to the recording LCM and two tests:
- Spawn: when OnSpawnCompleted fails, the seeded record is parked terminal/errored
  (via OnKillRequested(KillError)) and runtime+workspace are torn down.
- Restore: when OnSpawnCompleted fails post-create, the new runtime is destroyed
  while the workspace is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@harshitsinghbhandari harshitsinghbhandari left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review — 4c331d9

Verified in a throwaway worktree: gofmt clean, build/vet pass, go test -race green; internal/session coverage 67.5% → 72.7%. Both should-fix items resolved.

  • #4 Restore rollback ✅ — rollbackRuntime(handle) now fires on both the reopen-patch failure and the OnSpawnCompleted failure, tearing down the freshly-created runtime while preserving the workspace (right call — it may hold uncommitted work). The early missing-agent-session-id guard is a good addition: it bails before any I/O instead of launching an ambiguous "resume nothing."
  • #2 tests ✅ — TestSpawn_OnSpawnCompletedFailure_RoutesOrphanToErrored (seeded orphan → terminal/errored, since the store has no delete) and TestRestore_OnSpawnCompletedFailure_RollsBackRuntime (runtime destroyed, workspace preserved) cover exactly the previously-untested paths.
  • Minors — Send-delegates-to-messenger (by design), Cleanup best-effort runtime-destroy (documented), buildPrompt stub: all acknowledged, agreed as-is.

LGTM — approving. Clean work, and the real-LCM round-trip testing throughout this PR was the right call.

@harshitsinghbhandari harshitsinghbhandari merged commit 6cf7c04 into feat/lcm-sm-contracts May 27, 2026
2 checks passed
neversettle17-101 added a commit that referenced this pull request Jun 12, 2026
…viewer to worker harness

Per PR #197 review feedback:
- Reviewer agent posts its review to the PR itself, so remove the
  ports.PRReviewPoster port, the GitHub review poster, the submit HTTP
  route + DTO, and the service Submit method (#1, #4, #7).
- Trigger spawns the reviewer agent over the worker's worktree with its
  own review prompt, mirroring the session launch flow (resolve agent by
  harness -> argv -> runtime.Create) (#8, #9).
- Default reviewer harness reuses the worker's harness when supported,
  falling back to claude-code; reviewer config stays independent of the
  worker override (#5, #6).
- Drop the `ao review` CLI for this PR's scope (#2, #3).

Regenerated OpenAPI + TS types.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Vaibhaav-Tiwari pushed a commit that referenced this pull request Jun 14, 2026
* feat(review): configurable AO code review backend (V1)

Add per-project configurable code review of a worker's PR. A reviewer
agent runs one-shot over the worker's own worktree and posts its result
to the PR; the worker picks the feedback up through the existing SCM
observer review-nudge path.

- domain: ProjectConfig.reviewers (+ default reviewer harness), Review /
  ReviewRun types and verdict/status vocab.
- storage: review + review_run tables (0011), sqlc queries, store methods.
- service/review: rewrite the in-memory stub as a persisted ReviewService
  (Trigger/Submit/List) with a reviewer Runner over agent resolver +
  runtime; ports.PRReviewPoster implemented on the GitHub adapter.
- http: session-scoped routes POST /sessions/{id}/reviews/trigger,
  POST .../submit, GET .../reviews; regenerated OpenAPI + TS types.
- cli: ao review trigger|submit|list.
- frontend: adapt ReviewDashboard to the per-worker reviews API.

Closes #192

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): address review — drop submit/poster/CLI, default reviewer to worker harness

Per PR #197 review feedback:
- Reviewer agent posts its review to the PR itself, so remove the
  ports.PRReviewPoster port, the GitHub review poster, the submit HTTP
  route + DTO, and the service Submit method (#1, #4, #7).
- Trigger spawns the reviewer agent over the worker's worktree with its
  own review prompt, mirroring the session launch flow (resolve agent by
  harness -> argv -> runtime.Create) (#8, #9).
- Default reviewer harness reuses the worker's harness when supported,
  falling back to claude-code; reviewer config stays independent of the
  worker override (#5, #6).
- Drop the `ao review` CLI for this PR's scope (#2, #3).

Regenerated OpenAPI + TS types.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(review): restore ao review submit (records verdict+body in AO)

Per maintainer request, bring back `ao review submit`. AO records the
reviewer's verdict and body on the review_run and marks the pass complete;
it does not post to GitHub — the reviewer agent posts its review to the PR
itself.

- storage: add review_run.body (0011), persist via Insert/UpdateReviewRunResult.
- service: restore Submit (no SCM poster) storing verdict + body.
- http: restore POST /sessions/{id}/reviews/submit + SubmitReviewInput.
- cli: ao review submit [worker] --verdict --body (worker from arg/--session/$AO_REVIEW_WORKER).
- runner: reviewer prompt instructs posting to GitHub and recording via ao review submit.

Regenerated OpenAPI + TS types.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): move reviewer runner to its own package; sharpen prompt

Per PR #197 review:
- Move the concrete reviewer runner out of the service layer into a new
  internal/review_runner package (package reviewrunner), beside other
  orchestration packages like session_manager. The service keeps only the
  Runner interface + RunSpec it depends on; the agent-resolver + runtime
  launch flow lives in review_runner.
- Sharpen the reviewer prompt: tell the agent to diff against the PR base,
  focus on high-confidence findings, post via `gh pr review`, and record
  the result with `ao review submit`; review-only (no commits/edits).
- Add unit tests for the runner.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): simplify review_run schema; provider-agnostic reviewer prompt

Per PR #197 review:
- review_run: status default 'running' (drop 'pending'), drop CHECK
  constraints on status/verdict, drop the updated_at column and the
  session/iteration index. Propagated through queries, domain, store,
  service, and tests.
- Reviewer prompt no longer hardcodes GitHub/gh commands — it instructs the
  agent to use whatever review tooling the provider offers, keeping the
  flow extensible across SCM providers.

Regenerated sqlc + OpenAPI/TS.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): launch reviewer before persisting the run

Trigger now spawns the reviewer agent first and then writes the review_run
with a status derived from the launch outcome (running on success, failed
if it never started), instead of inserting a running row and correcting it
to failed afterwards.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): pluggable reviewer registry distinct from worker harnesses

Reviewers are now their own pluggable adapter set, separate from the worker
agent registry — adding a reviewer (claude-code today, greptile tomorrow) is
a one-line registration that does not widen the worker harness vocabulary,
and a worker harness does not automatically become a valid reviewer.

- domain.ReviewerHarness: a distinct vocabulary (AllReviewerHarnesses) with
  its own IsKnown; ReviewerConfig/Review/ReviewRun use it. ResolveReviewerHarness
  reuses the worker harness only when it is itself a supported reviewer, else
  falls back to claude-code.
- ports.Reviewer: a reviewer-specific contract (ReviewCommand → argv + env)
  that models one-shot / non-prompt CLIs natively instead of forcing every
  reviewer through the worker's interactive GetLaunchCommand(Prompt:...).
- internal/adapters/reviewer: a separate registry + resolver (mirrors the
  worker agent registry) with the claude-code reviewer adapter, which owns the
  review prompt and reuses the worker claude-code launch construction.
- review_runner resolves via the reviewer registry (not the worker
  AgentResolver) and merges AO_REVIEW_WORKER into the adapter's env.
- daemon wires the reviewer resolver. Registry/domain parity is test-enforced.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(review): cover run-scoped reviewer submit

* fix(api): update generated review submit schema

* refactor(review): split core engine (internal/review) from API service

Move the review orchestration (Trigger/Submit/List, run-id generation,
deps, RunSpec/Runner, sentinels) into a transport-independent core package
internal/review (Engine). internal/service/review is now a thin API-flow
boundary: the controller-facing Manager interface + a Service that delegates
to the engine + error re-exports.

This keeps the service layer to API concerns and lets the same engine back a
future in-process CLI trigger without going through HTTP. review_runner now
depends on the core package; daemon builds the engine and wraps it in the
service. No API/schema changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(review): commit-aware trigger, reviewer handle for UI, no env vars

Reworks the review trigger lifecycle and drops env-based coupling:

- review_run gains target_sha (the reviewed commit) and drops iteration.
  A repeat trigger for the same PR head short-circuits to the existing run.
- review gains reviewer_handle_id: the live reviewer pane's runtime handle,
  reused across passes and exposed in the reviews API so the UI can attach
  its terminal over /mux.
- Trigger flow: if a live reviewer pane exists and a new commit arrived,
  message it to re-review; otherwise spawn a fresh reviewer. The run is
  recorded only after the reviewer is launched.
- No environment variables: the reviewer adapter embeds the explicit
  `ao review submit --session <w> --run <id>` command in the spawn prompt
  and the re-review message. CLI submit requires --run/--session (no env
  fallbacks).
- Merge review_runner into internal/review as a Launcher (spawn/notify/alive).
- Trigger returns 201 for a new pass, 200 when reusing an existing run.

Regenerated sqlc + OpenAPI/TS.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): author the reviewer prompt centrally, not in the adapter

Mirror the worker model (session_manager builds the prompt; adapters just
place it via LaunchConfig.Prompt). The reviewer prompt now lives in
internal/review/prompt.go and is passed through ports.ReviewInvocation.Prompt;
the claude-code reviewer adapter just feeds inv.Prompt to its launch command
and returns it as the re-review message. One-shot CLI reviewers may ignore it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(review): split reviewer prompt into system+task, mirroring buildSpawnTexts

Mirror session_manager.buildSpawnTexts for the reviewer: a standing role goes
in the system prompt, the per-pass task (PR/commit + exact `ao review submit`
command) goes in the user prompt. internal/review/prompt.go now returns
(prompt, systemPrompt); both flow through ports.ReviewInvocation and the
claude-code adapter places them via LaunchConfig{Prompt, SystemPrompt}. The
re-review message reuses the per-pass prompt (role already established in the
running pane).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vaibhaav <user@example.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants